Skip to content

.NET: fix: HandoffAgentExecutor does not output any response when non-streaming#4745

Open
lokitoth wants to merge 3 commits intomainfrom
dev/dotnet_workflow/handoff_outputs_missing_in_thread
Open

.NET: fix: HandoffAgentExecutor does not output any response when non-streaming#4745
lokitoth wants to merge 3 commits intomainfrom
dev/dotnet_workflow/handoff_outputs_missing_in_thread

Conversation

@lokitoth
Copy link
Member

@lokitoth lokitoth commented Mar 17, 2026

Motivation and Context

Due to needing custom logic and a more delocalized state, the HandoffAgentExecutor is still separate from the general AIAgentHostExecutor. The fixes made to the AIAgentHostExecutor to better support both streaming and non-streaming agent invocation modes and separate configuration for outputs (streaming vs. nonstream, output vs. no output) in PR #3142 did not get ported to HandoffAgentExecutor in hopes of unifying the two.

Description

To better support Handoff workflows running in non-streaming mode (and in particular when hosted as an AIAgent), we change the HandoffAgentExecutor to always emit an AgentRunResponse if it is not configured to emitEvents on the TurnToken, which should be more properly called emitStreamingUpdates now that AgentRunResponseEvent exists.

Fixes #2423

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings March 17, 2026 14:08
@markwallace-microsoft markwallace-microsoft added .NET workflows Related to Workflows in agent-framework labels Mar 17, 2026
@github-actions github-actions bot changed the title fix: HandoffAgentExecutor does not output any reponse when non-streaming .NET: fix: HandoffAgentExecutor does not output any reponse when non-streaming Mar 17, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the specialized HandoffAgentExecutor to reliably produce workflow output in non-streaming mode (addressing the gap that existed compared to AIAgentHostExecutor improvements), and adds unit coverage to prevent regressions.

Changes:

  • Emit a final AgentResponse output from HandoffAgentExecutor when TurnToken.EmitEvents is not true (i.e., non-streaming).
  • Refactor existing AIAgentHostExecutor unit tests to share common assertions via a new base class.
  • Add a new unit test validating HandoffAgentExecutor output behavior across TurnToken.EmitEvents values.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
dotnet/src/Microsoft.Agents.AI.Workflows/Specialized/HandoffAgentExecutor.cs Emits a final AgentResponse output in non-streaming mode while preserving streaming-update behavior when enabled.
dotnet/tests/Microsoft.Agents.AI.Workflows.UnitTests/AIAgentHostExecutorTests.cs Extracts shared verification helpers into a base class and adds coverage for HandoffAgentExecutor output type selection.

You can also share your feedback on Copilot code review. Take the survey.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses missing agent output in non-streaming Handoff workflows when hosted as an AIAgent, ensuring a final AgentResponse is emitted (vs. only streaming updates), and adds coverage for history persistence.

Changes:

  • Update HandoffAgentExecutor to emit an AgentResponse when TurnToken.EmitEvents is not true (treating the flag as “streaming vs. non-streaming” output selection).
  • Persist merged outgoing messages into the workflow session chat history for both streaming and non-streaming agent runs.
  • Refactor/extend unit tests to cover the expected output type and that outgoing messages are stored in history.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
dotnet/tests/Microsoft.Agents.AI.Workflows.UnitTests/WorkflowHostSmokeTests.cs Adds smoke coverage that outgoing messages are persisted to workflow session history; updates test inheritance.
dotnet/tests/Microsoft.Agents.AI.Workflows.UnitTests/AIAgentHostExecutorTests.cs Refactors shared assertions into a base class and adds targeted HandoffAgentExecutor output-type tests.
dotnet/src/Microsoft.Agents.AI.Workflows/WorkflowHostAgent.cs Persists merged responses into workflow session chat history after running (streaming + non-streaming).
dotnet/src/Microsoft.Agents.AI.Workflows/WorkflowChatHistoryProvider.cs Adds an accessor to retrieve all messages from session state (used by tests).
dotnet/src/Microsoft.Agents.AI.Workflows/Specialized/HandoffAgentExecutor.cs Emits a final AgentResponse in non-streaming mode while continuing to stream updates when configured.

You can also share your feedback on Copilot code review. Take the survey.

@lokitoth lokitoth force-pushed the dev/dotnet_workflow/handoff_outputs_missing_in_thread branch from 45f0d3e to f11fbf0 Compare March 17, 2026 15:12
@lokitoth lokitoth force-pushed the dev/dotnet_workflow/handoff_outputs_missing_in_thread branch from f11fbf0 to 9665b2d Compare March 17, 2026 15:26
@lokitoth lokitoth changed the title .NET: fix: HandoffAgentExecutor does not output any reponse when non-streaming .NET: fix: HandoffAgentExecutor does not output any response when non-streaming Mar 17, 2026
@lokitoth lokitoth force-pushed the dev/dotnet_workflow/handoff_outputs_missing_in_thread branch from 9665b2d to 68976a1 Compare March 19, 2026 15:07
Copy link

@chetantoshniwal chetantoshniwal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 86%

✗ Correctness

This PR refactors chat history management by moving AddMessages and UpdateBookmark calls from per-update (inside WorkflowSession.CreateUpdate and InvokeStageAsync's finally block) to post-completion (in WorkflowHostAgent's RunCoreAsync/RunCoreStreamingAsync). It also adds a new output mode for HandoffAgentExecutor where the full AgentResponse is yielded when streaming updates are not requested. The changes are largely sound, but the streaming path in RunCoreStreamingAsync lacks a try/finally guard, meaning chat history and bookmark updates are silently skipped if the consumer disposes the async enumerator early or cancellation occurs—a behavioral regression from the old finally block in InvokeStageAsync.

✗ Security Reliability

The diff moves chat-history persistence (AddMessages + UpdateBookmark) out of WorkflowSession.InvokeStageAsync (which had a try/finally) into the callers in WorkflowHostAgent. The non-streaming path (RunCoreAsync) is fine because the method is not an async iterator and fully consumes the enumerable. However, RunCoreStreamingAsync is an async iterator (it uses yield return), and the post-loop AddMessages/UpdateBookmark calls are placed after the await foreach loop without a try/finally. In C#, when a consumer disposes an async enumerator early (break, cancellation, exception), code after the yield-return loop is skipped—only pending finally blocks execute. This means on early termination the response messages are never written to history and the bookmark is never advanced, leaving the session in an inconsistent state. This is a reliability regression from the old code whose finally block guaranteed bookmark updates on any exit path.

✗ Test Coverage

The diff moves chat-history bookkeeping out of WorkflowSession.CreateUpdate/InvokeStageAsync and into WorkflowHostAgent.RunCoreAsync and the new RunCoreStreamingAsync path. Two new smoke tests verify that outgoing messages appear in the chat history after RunAsync (non-streaming), and a new HandoffAgentExecutor test covers the emit-events logic. However, there is no corresponding streaming-path test for the chat-history/bookmark change in RunCoreStreamingAsync, which is the most significant behavioural addition in this PR. The removal of the try/finally around UpdateBookmark also changes error-path behaviour that is untested.

✗ Design Approach

The diff moves chat history management from incremental per-update writes (inside WorkflowSession.CreateUpdate) to a deferred batch write after the full turn completes (in WorkflowHostAgent). This is a cleaner design in principle, but the streaming path has a correctness gap: history and bookmark updates only run after the await foreach loop, with no try/finally guard. If a consumer disposes the streaming IAsyncEnumerable early, those commits never execute—yet LastCheckpoint can be advanced mid-stream (via SuperStepCompletedEvent). This creates a divergence between workflow checkpoint state (advanced) and chat history bookmark (stale), potentially causing re-delivery of already-processed messages on the next turn. The old finally block in InvokeStageAsync existed precisely to prevent this. Separately, HandoffAgentExecutor repurposes TurnToken.EmitEvents—documented as controlling whether agent run events are raised—to also mean 'emit a batched full response instead of streaming updates'. The inline comment acknowledges this is a workaround due to a configuration limitation, which is itself a flag for a deeper design issue rather than a fix.

Flagged Issues

  • In WorkflowHostAgent.RunCoreStreamingAsync, AddMessages and UpdateBookmark are placed after the await foreach loop with no try/finally. Since this is an async iterator (uses yield return), if the consumer disposes the enumerator early (cancellation, break, or exception), post-loop code is skipped—only pending finally blocks execute. Meanwhile, WorkflowSession.LastCheckpoint can already be advanced mid-stream (by SuperStepCompletedEvent), so skipping UpdateBookmark leaves a divergence: the workflow has moved past a checkpoint but the history bookmark has not, causing re-delivery of already-consumed messages on the next turn. The old finally block in InvokeStageAsync prevented exactly this. A try/finally around the loop is needed to ensure chat history is updated on all exit paths.
  • The new smoke tests (Test_SingleAgent_AsAgent_OutgoingMessagesInHistoryAsync, Test_Handoffs_AsAgent_OutgoingMessagesInHistoryAsync) both call RunAsync, exercising only the non-streaming path. The streaming path in RunCoreStreamingAsync contains identical new logic (MessageMerger accumulation, AddMessages, UpdateBookmark) but has zero direct test coverage.

Suggestions

  • Add streaming-path equivalents of the two new smoke tests that call RunStreamingAsync, consume all updates, and verify the session's chat history and bookmark are updated afterwards.
  • The refactor also changes error-path behaviour: UpdateBookmark was previously in a finally block inside InvokeStageAsync, guaranteeing it ran even when the enumeration threw. It now sits after the foreach loop. Consider adding a test that verifies the intended behaviour when the inner workflow throws—is it intentional that the bookmark is not updated in that case?
  • HandoffAgentExecutor repurposes TurnToken.EmitEvents (documented as controlling whether agent run events are raised) to also decide whether to emit a full batched AgentResponse—a distinct concept. This will break callers that set EmitEvents = false to silence streaming events but do not expect a batched response injected into the output stream. A dedicated option on HandoffAgentExecutorOptions that controls batch-vs-streaming output mode would express the intent without overloading an unrelated flag's semantics.

Automated review by chetantoshniwal's agents

Comment on lines 142 to +157
await this.ValidateWorkflowAsync().ConfigureAwait(false);

WorkflowSession workflowSession = await this.UpdateSessionAsync(messages, session, cancellationToken).ConfigureAwait(false);
MessageMerger merger = new();

await foreach (AgentResponseUpdate update in workflowSession.InvokeStageAsync(cancellationToken)
.ConfigureAwait(false)
.WithCancellation(cancellationToken))
{
merger.AddUpdate(update);
yield return update;
}

AgentResponse response = merger.ComputeMerged(workflowSession.LastResponseId!, this.Id, this.Name);
workflowSession.ChatHistoryProvider.AddMessages(workflowSession, response.Messages);
workflowSession.ChatHistoryProvider.UpdateBookmark(workflowSession);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RunCoreStreamingAsync is an async iterator. If the caller breaks out of enumeration early (cancellation, break, exception), code after the await foreach loop is skipped—only pending finally blocks execute. This means AddMessages and UpdateBookmark are silently skipped on early termination. Worse, WorkflowSession.LastCheckpoint can already be advanced mid-stream (when SuperStepCompletedEvent fires), so the next call to GetFromBookmark will return messages the workflow has already consumed, causing a checkpoint/bookmark divergence. The old InvokeStageAsync finally block prevented this. Wrap the loop and post-loop code in a try/finally:

Suggested change
await this.ValidateWorkflowAsync().ConfigureAwait(false);
WorkflowSession workflowSession = await this.UpdateSessionAsync(messages, session, cancellationToken).ConfigureAwait(false);
MessageMerger merger = new();
await foreach (AgentResponseUpdate update in workflowSession.InvokeStageAsync(cancellationToken)
.ConfigureAwait(false)
.WithCancellation(cancellationToken))
{
merger.AddUpdate(update);
yield return update;
}
AgentResponse response = merger.ComputeMerged(workflowSession.LastResponseId!, this.Id, this.Name);
workflowSession.ChatHistoryProvider.AddMessages(workflowSession, response.Messages);
workflowSession.ChatHistoryProvider.UpdateBookmark(workflowSession);
WorkflowSession workflowSession = await this.UpdateSessionAsync(messages, session, cancellationToken).ConfigureAwait(false);
MessageMerger merger = new();
try
{
await foreach (AgentResponseUpdate update in workflowSession.InvokeStageAsync(cancellationToken)
.ConfigureAwait(false)
.WithCancellation(cancellationToken))
{
merger.AddUpdate(update);
yield return update;
}
}
finally
{
AgentResponse response = merger.ComputeMerged(workflowSession.LastResponseId!, this.Id, this.Name);
workflowSession.ChatHistoryProvider.AddMessages(workflowSession, response.Messages);
workflowSession.ChatHistoryProvider.UpdateBookmark(workflowSession);
}

ChatMessage[] sessionMessages = workflowSession.ChatHistoryProvider.GetAllMessages(workflowSession).ToArray();

// Since we never sent an incoming message, the expectation is that there should be nothing in the session
// except the response

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test validates the non-streaming path (RunAsync). The streaming path in RunCoreStreamingAsync now has identical AddMessages + UpdateBookmark logic (new in this PR) but no test. Please add a streaming variant that calls RunStreamingAsync, consumes all updates, then asserts GetAllMessages matches the streamed content.

ChatMessage[] responseMessages = response.Messages.ToArray();
ChatMessage[] sessionMessages = workflowSession.ChatHistoryProvider.GetAllMessages(workflowSession).ToArray();

// Since we never sent an incoming message, the expectation is that there should be nothing in the session

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same gap: the handoff-workflow history test only covers RunAsync. A streaming variant is needed to cover the new RunCoreStreamingAsync bookmark/history logic.

Comment on lines +259 to +262
if (message.TurnToken.EmitEvents is not true)
{
await context.YieldOutputAsync(agentResponse, cancellationToken).ConfigureAwait(false);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TurnToken.EmitEvents is documented as controlling whether agent run events (streaming updates) are raised. Here it is repurposed to also decide whether to emit a full batched AgentResponse—a distinct concept. This will break callers that set EmitEvents = false to silence streaming events but do not expect a batched response injected into the output stream. A dedicated option on HandoffAgentExecutorOptions that controls batch-vs-streaming output mode would express the intent without overloading an unrelated flag's semantics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

.NET workflows Related to Workflows in agent-framework

Projects

None yet

Development

Successfully merging this pull request may close these issues.

.NET - Assistant responses are not stored when using Workflow as Agent with Handoff orchestration

5 participants